Parallel Corpus

Definition

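parallel corpus (Chinese: 平行语料库): a corpus made up of corresponding texts in two or more languages, usually aligned at the sentence or paragraph level. It is commonly used in translation studies, machine translation, bilingual lexicon extraction, and cross-lingual information retrieval. A parallel corpus covering exactly two languages is also often called a bitext.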

Pronunciation (IPA)

/ˈpærəˌlɛl ˈkɔːrpəs/

Examples

We trained the translation model on a large parallel corpus.
我们用一个大型平行语料库来训练翻译模型。

Because the parallel corpus is sentence-aligned, we can automatically learn translation equivalents and compare how the same idea is expressed across languages.
由于平行语料库按句子对齐，我们可以自动学习对应译法，并比较同一观点在不同语言中的表达方式。
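
To make "sentence-aligned" concrete, the two example pairs above can themselves be treated as a tiny English-Chinese parallel corpus. The following is only a minimal sketch: the names `SentencePair`, `corpus`, and `find_pairs` are hypothetical, chosen for illustration, and real corpora are usually stored in dedicated file formats rather than in-memory lists.

```python
from typing import List, NamedTuple


class SentencePair(NamedTuple):
    """One aligned unit: a source sentence and its translation."""
    source: str  # English side
    target: str  # Chinese side


# The two example sentences from this entry, stored as aligned pairs.
corpus: List[SentencePair] = [
    SentencePair(
        "We trained the translation model on a large parallel corpus.",
        "我们用一个大型平行语料库来训练翻译模型。",
    ),
    SentencePair(
        "Because the parallel corpus is sentence-aligned, we can automatically "
        "learn translation equivalents and compare how the same idea is "
        "expressed across languages.",
        "由于平行语料库按句子对齐，我们可以自动学习对应译法，并比较同一观点在不同语言中的表达方式。",
    ),
]


def find_pairs(pairs: List[SentencePair], term: str) -> List[SentencePair]:
    """Return every pair whose source side contains `term` (case-insensitive)."""
    needle = term.lower()
    return [pair for pair in pairs if needle in pair.source.lower()]


if __name__ == "__main__":
    # Show how "parallel corpus" is rendered on the Chinese side of each pair.
    for pair in find_pairs(corpus, "parallel corpus"):
        print(pair.source)
        print(pair.target)
        print()
```

Because each English sentence is stored next to its translation, a lookup on one side immediately yields the matching text on the other. In this sample, both pairs returned for "parallel corpus" contain 平行语料库 on the Chinese side.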

Etymology

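parallel comes from Greek parallēlos ("beside one another, side by side") and entered English through Latin and French. corpus comes from Latin corpus ("body, whole"), later extended to mean "a collection of texts". Together the phrase means "a collection of parallel (side-by-side) texts".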

Notable Works